ABSTRACT

The construct validity of the standardized-patient 
(SP) examination used at Southern Illinois University (Springfield) 
School of Medicine was assessed by comparing 66 second-year and 70 
fourth-year medical students on 5 SP cases. The results show sizable 
differences between the groups. The usefulness of passing rates and
effect-size measures as a means of enhancing the typically weak
evidence for validity provided by group-differences studies of 
construct validity is demonstrated. Results obtained through these 
approaches show that, as would be expected, the clerkships are having 
a considerable effect on clinical competence and that the examination 
is sensitive, as a valid measure of clinical competence should be, to 
these changes in the clinical-competence construct. A table 
summarizes passing rates and group means. (Author/SLD) 
ABSTRACT

The construct validity of the standardized-patient (SP) examination used
at Southern Illinois University School of Medicine was assessed by comparing 
second-year and fourth-year medical students on five SP cases. The results 
showed sizeable differences between the groups. The paper demonstrates the 
usefulness of passing rates and effect-size measures as a means of enhancing 
the typically weak evidence for validity provided by group-differences studies 
of construct validity. The results obtained with these approaches show that 
the clerkships are having a considerable effect on clinical competence as 
would be expected and that the examination is sensitive to these changes in 
the clinical-competence construct as a valid measure of clinical competence 
should.



The use of standardized patients (SPs) for assessing clinical competence
has increased rapidly in recent years.1 In response to this increase, a large
body of research on the psychometric properties of SP-based assessments has
emerged.2 Much of this research has dealt with reliability. These studies
have demonstrated the reproducibility of examination scores and pass-fail
decisions and have shown that the reproducibility is generally unaffected by
potentially disruptive factors. A few studies have focused on the
construct validity of these assessments by showing that groups of examinees at
more advanced levels of training perform better on the SP cases than do
examinees at earlier levels. The rationale for this group-differences
approach to construct validity is that the validity of a measure of a given
construct is supported if the measure is sensitive to and reflects differences
among groups thought to differ on the construct.

Three of the group-differences studies compared residents at different
points in their residency training programs. Another study compared first-
and second-year residents with third-year medical students, and another
compared fifth- and sixth-year medical students and residents. In general,
the results of these studies supported the construct validity of the SP-based 
assessments of clinical competence, by showing that the clinical competence 
construct was increasing with additional clinical training as would be 
expected and as should be reflected by a valid measure of clinical competence. 

At Southern Illinois University (SIU) School of Medicine, a
performance-based examination of clinical competence that uses SP cases is
given to all senior medical students upon completion of their clinical
clerkships.11,12 Students are expected to pass the examination to fulfill a
part of their graduation requirements. The purpose of this Post-Clerkship
Examination is to determine if students have mastered the clinical
competencies expected of students upon completion of their clerkships as
defined by School of Medicine objectives.

The present study was conducted to assess the construct validity of the
SIU SP-based Post-Clerkship Examination. To accomplish this, five SP cases
that were used as a part of the Post-Clerkship Examination administered to
seniors in the class of 1990 were selected for administration to second-year
students in the class of 1992 upon completion of their Introduction to
Clinical Medicine course. Thus, the present study provides a comparison of
second-year and fourth-year medical students, groups tested before and after,
respectively, their first major clinical experience in the clinical
clerkships. Comparing the means of the second- and fourth-year students is
the typical analytic approach in group-differences studies of construct
validity. In addition to that comparison, in this study the passing rates of
the two groups were compared and strength-of-effect measures were computed to
provide a quantitative indication of the impact of the clerkship training on
clinical competence and of the sensitivity of the Post-Clerkship Examination
to differences between the second- and fourth-year student groups.

Methods 

The Examination. A thorough discussion of the SIU Post-Clerkship
Examination, including details of its development, administration, and
scoring, is presented elsewhere and should be consulted for a full description
of the examination.11,12 In brief, the examination is a performance-based
examination that uses about 18 forty-minute SP cases (20 minutes for the
student-SP encounter and another 20 minutes immediately following the
encounter for students to answer written questions about the case). Cases for
the examination are chosen by the faculty Post-Clerkship Examination committee
and represent the most frequently encountered patient problems as well as the
most important patient problems that students are expected to evaluate and
manage competently. Competencies to be assessed by each case are determined
by the faculty committee. Faculty physicians provide the patient cases and,
with an educator, develop the instruments for collecting student performance
data. These data consist of checklists completed by SPs, who record actions
performed by students on history and physical, and written responses by
students following the patient encounters to questions concerning findings,
tentative diagnostic conclusions, and plans for treatment and management. For
each case, a passing level is established by the case author and reviewed by
the faculty committee. This Case Pass Level reflects the standards of
performance expected of senior medical students upon completion of their
clerkship rotations, expressed as minimal scores on the clinical
competencies being assessed by the case. For this study, a student was said
to have passed the full examination, consisting of all five SP cases, if his
or her total examination score (i.e., the mean of all of the student's case
scores) exceeded the mean of the Case Pass Levels.
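
The pass rule used for this study can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure actually used at SIU: the function name is hypothetical, and the scores and Case Pass Levels in the usage lines are made-up values chosen only to exercise the rule.

```python
from statistics import mean


def passed_examination(case_scores, case_pass_levels):
    """Return True if the total score exceeds the mean Case Pass Level.

    The total examination score is the mean of the student's case scores,
    as described in the text. Scores and pass levels share one scale.
    """
    if len(case_scores) != len(case_pass_levels):
        raise ValueError("one pass level is needed per case")
    total_score = mean(case_scores)
    pass_threshold = mean(case_pass_levels)
    # "Exceeded" is read strictly: a score equal to the threshold fails.
    return total_score > pass_threshold


# Hypothetical numbers, for illustration only (not data from the study).
example_pass_levels = [70, 68, 72, 75, 65]  # mean is 70
print(passed_examination([74, 71, 69, 80, 66], example_pass_levels))  # mean 72
print(passed_examination([60, 72, 65, 70, 63], example_pass_levels))  # mean 66
```

Note that under this rule a student may fall below the Case Pass Level on an individual case and still pass the full examination, because only the averages are compared.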

Analytic Method. For the present study, five SP cases were randomly
selected from 13 of the 18 cases that comprised the Post-Clerkship Examination
given to the class of 1990 (n = 70) in their fourth year of medical school
(tested in October, 1989). Five cases that emphasized management, pediatrics,
or OB/GYN problems were excluded from the random selection process because
they were not thought appropriate for second-year students. The five cases
selected were then administered to second-year students in the class of 1992
(n = 66) upon completion of their Introduction to Clinical Medicine course
(tested in May, 1990). The fourth-year students were tested after completion
of all clerkship rotations and the second-year students were tested before
their clerkships were started. The means and passing rates of the second-year
and fourth-year students were compared on each of the five selected cases with
t tests and chi-square (χ²) tests, respectively. Differences between the means
in pooled standard deviation units and odds ratios were computed, in order to
assess the strength of the effect of the clinical clerkship experience and the
sensitivity of the examination to the effect of this experience.
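
The two strength-of-effect measures described above can be sketched as follows. This is a minimal illustration rather than the authors' actual computation: the function names are hypothetical, the pooled standard deviation is assumed to be the usual degrees-of-freedom-weighted one, and the odds ratio is computed from rounded passing percentages because raw pass counts are not reported. The usage lines plug in the total-score summary statistics from Table 1.

```python
from math import sqrt


def pooled_sd_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference between two group means in pooled standard deviation units.

    Assumes the conventional pooled variance, weighting each group's
    variance by its degrees of freedom (n - 1).
    """
    pooled_variance = (
        (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    ) / (n_a + n_b - 2)
    return (mean_b - mean_a) / sqrt(pooled_variance)


def odds_ratio(pass_rate_a, pass_rate_b):
    """Odds of passing in group b divided by the odds of passing in group a."""
    odds_a = pass_rate_a / (1 - pass_rate_a)
    odds_b = pass_rate_b / (1 - pass_rate_b)
    return odds_b / odds_a


# Total-score values from Table 1: second-year (n = 66), fourth-year (n = 70).
d_total = pooled_sd_difference(62.43, 5.86, 66, 74.59, 5.21, 70)
or_total = odds_ratio(0.03, 0.70)
print(round(d_total, 2), round(or_total, 1))
```

With the rounded table values this gives a difference of roughly 2.2 pooled standard deviations, in line with the reported 2.19, and an odds ratio of roughly 75. The reported odds ratio of 78.20 was presumably computed from unrounded pass counts, so an exact match should not be expected.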

Results and Discussion

The results of the statistical tests provide good support for the
construct validity of the SP assessment. (See Table 1.) For total scores,
which are averages across all five cases, the mean of the fourth-year group
(74.59) was significantly higher than that of the second-year group (62.43)
(p = .0001), and the passing rate of the fourth-year group (70%) was
significantly higher than that of the second-year group (3%) (p = .001). For
all five cases, the means and passing rates were significantly higher for the
fourth-year students than for the second-year students (p < .05).

However, as pointed out by writers from Cronbach and Meehl to van der
Vleuten and Swanson, the demonstration of significant group differences
constitutes at best weak evidence for validity. The problem, in part, is that
it is not clear how much the groups should differ. If an expected difference
in the amount of a construct (e.g., clinical competence) possessed by members
of the different groups were specified, an empirical confirmation of the
specified difference with the measuring instrument would provide strong
support for the construct validity of the measure. To address this problem,
passing rates and strength-of-effect measures were computed, in addition to
the tests of significance on the means. Although it is not clear how big the
group differences should be, the effect-size measures indicate how big they
are, so that it is possible to judge, at least intuitively, whether they seem
to be of reasonable magnitude given our conceptual understanding of the
construct.

The passing rates, in particular, provided especially informative
evidence on the question of how large the group difference is. For total
scores, which are averages across all five cases, the difference was
considerable, with 70% of the fourth-year students passing and only 3% of the
second-year students passing. Clearly, the clinical experience in the
clerkships was a virtual necessity for passing the five-case,
clinical-competence examination. The magnitude of the difference between the
passing rates (3% versus 70%) would seem to be consistent with our intuitive
expectation of the magnitude of the difference in clinical competence between
second-year and fourth-year students resulting from their first major
clinical experience. Similarly, the odds ratio for this difference was 78.20,
indicating that the odds of a fourth-year student passing the examination
were 78 times the odds of a second-year student passing. Again, the magnitude
of effect given by the odds ratio would seem to be consistent with our
intuitive expectation, indicating a sizeable effect. Even for means, the
difference between the total score means was 2.19 standard deviations,
indicating that the average performance of fourth-year students was over 2
standard deviations higher than that of second-year students.

Conclusion 

The significance of the study is that the results show sizeable 
differences between the second- and fourth-year groups (i.e., groups assessed 
before and after their clinical clerkships, respectively) and thus provide 
good support for the construct validity of the Post-Clerkship Examination as a 
measure of clinical competence. Moreover, the paper demonstrates the 
usefulness of passing rates and effect-size measures as a means of enhancing 
the typically weak evidence for validity provided by group-differences
studies. For example, the comparison of passing rates provides a more
intuitively meaningful estimate of the magnitude of the effect of the
clerkship rotations on clinical competence. The use of odds ratios with the
passing rates provides another intuitively meaningful way of estimating the
effect size. Even the expression of mean differences in standard deviation
units adds to the meaning of the usual significance-test assessment of group
differences. The results obtained with these approaches show that the
clerkships are having a considerable effect on clinical competence as would be
expected and that the Post-Clerkship Examination is sensitive to these changes
in the clinical-competence construct as a valid measure of clinical competence
should.
Table 1.

Means (± standard deviations) and passing rates with p values and strength of effect measures (d for
means and OR for passing rates) for second-year (n = 66) and fourth-year (n = 70) students on five
SP cases. d is the difference between means in pooled standard deviation units and OR is the odds ratio.

                   Second-year        Fourth-year        p        Strength

Means:                                                              d
  Case 1           67.79 (±9.28)      73.26 (±9.69)      .0010      .58
  Case 2           57.36 (±9.92)      70.58 (±10.67)     .0001     1.29
  Case 3           65.47 (±13.97)     71.39 (±13.89)     .0145      .42
  Case 4           60.63 (±11J2)      84.27 (±9.58)      .0001     2.23
  Case 5           60.93 (±9.90)      73.45 (±10.77)     .0001     1.21
  Total            62.43 (±5.86)      74.59 (±5.21)      .0001     2.19

Passing Rates:                                                      OR
  Case 1           41%                68%                .002      3.03
  Case 2           37%                80%                .001      6.90
  Case 3           23%                44%                .009      2.65
  Case 4           14%                89%                .001     50.57
  Case 5            1%                38%                .001     42.07
  Total             3%                70%                .001     78.20






